1. Consider the problem of predicting how well a student does in their second year of 
college/university, given how well they did in their first year. Specifically, let x be 
equal to the number of "A" grades (including A-, A and A+ grades) that a student 
receives in their first year of college (freshman year). We would like to predict the value 
of y, which we define as the number of "A" grades they get in their second year. 

Use the following training set of a small sample of different students' performances. Here 
each row is one training example. 


Recall that in linear regression, the hypothesis is 

$h_\theta(x) = \theta_0 + \theta_1 x$, 

and we use m to denote the number of training examples. 


a) For the training set given above, what is the value of m? 

b) Recall the definition of the cost function 

$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$ 

What is $J(0, 1)$? 

c) Suppose we set $\theta_0 = 0$ and $\theta_1 = 1.5$. What is $h_\theta(2)$? 
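
As an illustration of parts (b) and (c), the hypothesis and the squared-error cost can be sketched in Python. The training set is not reproduced in this text, so the lists `x_toy` and `y_toy` below are made-up placeholder values, not the data from the question, and the function names are only illustrative.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Part (c): theta0 = 0, theta1 = 1.5, evaluated at x = 2.
print(hypothesis(0, 1.5, 2))  # 3.0

# Placeholder data (hypothetical, NOT the training set from the question).
x_toy = [1, 2, 3]
y_toy = [1, 2, 4]
print(cost(0, 1, x_toy, y_toy))
```

To answer part (b) for the real exercise, the placeholder lists would need to be replaced with the actual training examples.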

2. Suppose you have m=14 training examples with n= 3 features (excluding the additional 
all-ones feature for the intercept term, which you should add). The normal equation is 

$\theta = (X^T X)^{-1} X^T y$ 

For the given values of m and n, what are the dimensions of $\theta$, X, and y in this equation? 

Use the normal equation to solve for $\theta$. 
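
A minimal NumPy sketch of the normal equation for m = 14 and n = 3 is shown here as a way to check the matrix shapes. The question supplies no data, so the feature and target values are random numbers generated purely for illustration, and the variable names are assumptions.

```python
import numpy as np

m, n = 14, 3                      # values from the question
rng = np.random.default_rng(0)    # synthetic data, for shape checking only

features = rng.normal(size=(m, n))
X = np.hstack([np.ones((m, 1)), features])  # prepend the all-ones intercept column
y = rng.normal(size=(m, 1))

# Normal equation: theta = (X^T X)^{-1} X^T y.
# Solving the linear system avoids forming the inverse explicitly.
theta = np.linalg.solve(X.T @ X, X.T @ y)

print(X.shape, y.shape, theta.shape)  # (14, 4) (14, 1) (4, 1)
```

Using `np.linalg.solve` rather than an explicit inverse is a design choice made here for numerical stability; it yields the same solution whenever the matrix is invertible.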

3. Suppose you have a dataset with m = 1000000 examples and n = 15 features for each 
example. You want to use multivariate linear regression to fit the parameters $\theta$ to the data. 
Should you prefer gradient descent or the normal equation? Why? 

4. Suppose you have a dataset with m = 50 examples and n = 200000 features for each 
example. You want to use multivariate linear regression to fit the parameters $\theta$ to the data. 
Should you prefer gradient descent or the normal equation? Why? 
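
One way to reason about questions 3 and 4 is to look at the size of the matrix that the normal equation has to build and solve, which is (n+1) by (n+1) regardless of m. The helper below is a back-of-the-envelope sketch that assumes dense 8-byte floating-point storage; the function name is illustrative.

```python
def xtx_bytes(n, bytes_per_value=8):
    """Memory for a dense (n+1) x (n+1) matrix X^T X, in bytes."""
    size = n + 1  # +1 for the intercept column
    return size * size * bytes_per_value


# (m, n) pairs from questions 3 and 4.
for m, n in [(1_000_000, 15), (50, 200_000)]:
    print(f"m={m}, n={n}: X^T X needs {xtx_bytes(n)} bytes")
```

With n = 15 the matrix is only 16 by 16, so the cost of the normal equation is dominated by a single pass over the data. With n = 200000 the dense matrix would take roughly 320 GB, and solving the system scales roughly with the cube of n. In addition, when m is smaller than n + 1 the matrix cannot be inverted, which is a further point to weigh.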


5. Suppose m = 4 students have taken some class, and the class had a midterm exam and a 
final exam. You have collected a dataset of their scores on the two exams, which is as 
follows: 


midterm exam    (midterm exam)^2    final exam
89              7921                96
72              5184                74
94              8836                87
69              4761                78


You'd like to use polynomial regression to predict a student's final exam score from their 
midterm exam score. Concretely, suppose you want to fit a model of the form 

$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$, where $x_1$ is the midterm score and $x_2$ is $(\text{midterm score})^2$. 
Further, you plan to use both feature scaling (dividing by the "max-min", or range, of a feature) 
and mean normalization. 

What is the normalized feature $x_2^{(2)}$? (Hint: midterm = 89, final = 96 is training example 1.) 
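
The normalization described in this question (subtract the feature's mean, then divide by its range) can be sketched as follows, using the midterm scores from the dataset. The sketch assumes the quantity asked for is the second feature, (midterm score)^2, of the second training example; the function and variable names are illustrative.

```python
midterm = [89, 72, 94, 69]
midterm_sq = [v ** 2 for v in midterm]  # 7921, 5184, 8836, 4761


def normalize(values):
    """Mean normalization followed by scaling by the range (max - min)."""
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    return [(v - mean) / value_range for v in values]


x2_normalized = normalize(midterm_sq)
# Training example 2 is index 1 (example 1 is midterm = 89).
print(round(x2_normalized[1], 2))  # -0.37
```

Here the mean of the squared scores is 6675.5 and the range is 8836 - 4761 = 4075, so the second example gives (5184 - 6675.5) / 4075, or about -0.37.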